What's Changed
- Hpc stabilize by @alegresor in https://github.com/QMCSoftware/QMCSoftware/pull/382
- Update README.md by @zitterbewegung in https://github.com/QMCSoftware/QMCSoftware/pull/385
- Geometric brownian motion by @larissensium in https://github.com/QMCSoftware/QMCSoftware/pull/392
- QMCPy Overhaul by @alegresor in https://github.com/QMCSoftware/QMCSoftware/pull/391
- v2.0 by @alegresor in https://github.com/QMCSoftware/QMCSoftware/pull/394

New Contributors
- @larissensium made their first contribution in https://github.com/QMCSoftware/QMCSoftware/pull/392

Full Changelog: https://github.com/QMCSoftware/QMCSoftware/compare/v1.6.1...v2.0
-
This paper discusses the theory and algorithms for interacting large language model agents (LLMAs) using methods from statistical signal processing and microeconomics. While both fields are mature, their application to decision-making involving interacting LLMAs remains unexplored. Motivated by Bayesian sentiment analysis on online platforms, we construct interpretable models and stochastic control algorithms that enable LLMAs to interact and perform Bayesian inference. Because interacting LLMAs learn from both prior decisions and external inputs, they can exhibit bias and herding behavior. Thus, developing interpretable models and stochastic control algorithms is essential to understand and mitigate these behaviors. This paper has three main results. First, we show using Bayesian revealed preferences from microeconomics that an individual LLMA satisfies the necessary and sufficient conditions for rationally inattentive (bounded rationality) Bayesian utility maximization and, given an observation, the LLMA chooses an action that maximizes a regularized utility. Second, we utilize Bayesian social learning to construct interpretable models for LLMAs that interact sequentially with each other and the environment while performing Bayesian inference. Our proposed models capture the herding behavior exhibited by interacting LLMAs. Third, we propose a stochastic control framework to delay herding and improve state estimation accuracy under two settings: 1) centrally controlled LLMAs and 2) autonomous LLMAs with incentives. Throughout the paper, we numerically demonstrate the effectiveness of our methods on real datasets for hate speech classification and product quality assessment, using open-source models like LLaMA and Mistral and closed-source models like ChatGPT. The main takeaway of this paper, based on substantial empirical analysis and mathematical formalism, is that LLMAs act as rationally bounded Bayesian agents that exhibit social learning when interacting. 
Traditionally, such models are used in economics to study interacting human decision-makers.
Free, publicly accessible full text available January 1, 2026.
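The herding behavior this abstract describes can be illustrated with the classic binary Bayesian social learning model. This is a generic textbook sketch, not the paper's construction; the signal accuracy q, the horizon, and the public-update rule are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Minimal sketch of binary Bayesian social learning (illustrative only).
# Each agent receives a private signal that is correct with probability q,
# combines it with the shared public belief, and announces the MAP state.
# Later agents see only announcements, so once the public belief is extreme
# enough, announcements stop reflecting private signals: herding.

TRUE_STATE = 1
q = 0.7          # private signal accuracy (hypothetical value)
prior = 0.5      # public belief P(x = 1)

def bayes(prior_1, obs, acc):
    """Posterior P(x = 1) after an observation that equals x with prob. acc."""
    l1 = acc if obs == 1 else 1 - acc
    l0 = 1 - acc if obs == 1 else acc
    return l1 * prior_1 / (l1 * prior_1 + l0 * (1 - prior_1))

actions = []
for _ in range(20):
    signal = TRUE_STATE if rng.random() < q else 1 - TRUE_STATE
    posterior = bayes(prior, signal, q)   # this agent's private belief
    action = int(posterior > 0.5)         # announced MAP action
    actions.append(action)
    # Crude public update: treat the announcement as a signal of accuracy q.
    # When the prior is already extreme, the action no longer depends on the
    # private signal, yet this update keeps reinforcing the public belief.
    prior = bayes(prior, action, q)
```

Once the public belief saturates, every subsequent agent announces the same action regardless of its private signal, which is the cascade the paper's stochastic control framework aims to delay.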
-
A learner aims to minimize a function f by repeatedly querying a distributed oracle that provides noisy gradient evaluations. At the same time, the learner seeks to hide arg min f from a malicious eavesdropper that observes the learner’s queries. This paper considers the problem of covert or learner-private optimization, where the learner has to dynamically choose between learning and obfuscation by exploiting the stochasticity. The problem of controlling the stochastic gradient algorithm for covert optimization is modeled as a Markov decision process, and we show that the dynamic programming operator has a supermodular structure implying that the optimal policy has a monotone threshold structure. A computationally efficient policy gradient algorithm is proposed to search for the optimal querying policy without knowledge of the transition probabilities. As a practical application, our methods are demonstrated on a hate speech classification task in a federated setting where an eavesdropper can use the optimal weights to generate toxic content, which is more easily misclassified. Numerical results show that when the learner uses the optimal policy, an eavesdropper can only achieve a validation accuracy of 52% with no information and 69% when it has a public dataset with 10% positive samples compared to 83% when the learner employs a greedy policy.
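The monotone threshold structure established for the optimal policy admits a very compact sketch. The state interpretation, names, and threshold value below are illustrative assumptions, not taken from the paper.

```python
# Sketch of a monotone threshold policy (illustrative names and values only):
# in each MDP state s -- e.g. a quantized measure of how much the learner
# still has to learn -- the learner obfuscates below a threshold and sends
# informative queries at or above it. Monotonicity means the action switches
# at most once as s increases, so the search over policies reduces to a
# search over a single scalar threshold.

def threshold_policy(state: int, threshold: int) -> str:
    """Learn iff the state is at or above the threshold."""
    return "learn" if state >= threshold else "obfuscate"

# The policy switches action exactly once as the state sweeps upward:
trace = [threshold_policy(s, threshold=5) for s in range(10)]
```

This single-switch structure is what makes a policy gradient search over thresholds computationally cheap compared to optimizing an unstructured policy.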
-
This letter studies how a stochastic gradient algorithm (SG) can be controlled to hide the estimate of the local stationary point from an eavesdropper. Such problems are of significant interest in distributed optimization settings like federated learning and inventory management. A learner queries a stochastic oracle and incentivizes the oracle to obtain noisy gradient measurements and perform SG. The oracle probabilistically returns either a noisy gradient of the function or a non-informative measurement, depending on the oracle state and incentive. The learner’s query and incentive are visible to an eavesdropper who wishes to estimate the stationary point. This letter formulates the problem of the learner performing covert optimization by dynamically incentivizing the stochastic oracle and obfuscating the eavesdropper as a finite-horizon Markov decision process (MDP). Using conditions for interval-dominance on the cost and transition probability structure, we show that the optimal policy for the MDP has a monotone threshold structure. We propose searching for the optimal stationary policy with the threshold structure using a stochastic approximation algorithm and a multi-armed bandit approach. The effectiveness of our methods is numerically demonstrated on a covert federated learning hate-speech classification task.
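A stochastic-approximation search over threshold policies can be illustrated with a generic simultaneous-perturbation (SPSA-style) scheme that needs only noisy evaluations of the policy cost. The cost function, step-size schedules, and optimum below are invented for illustration and are not the letter's algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative SPSA-style threshold search (hypothetical cost and schedules):
# the learner can only simulate a policy and observe a noisy cost, so the
# threshold parameter theta is updated with a two-sided finite-difference
# gradient estimate built from perturbed evaluations.

def noisy_cost(theta: float) -> float:
    """Hypothetical noisy policy cost with its minimum at theta = 3."""
    return (theta - 3.0) ** 2 + 0.1 * rng.standard_normal()

theta = 0.0
for k in range(1, 501):
    a_k = 1.0 / k            # decreasing step size
    c_k = 1.0 / k ** 0.25    # decreasing perturbation size
    delta = rng.choice([-1.0, 1.0])
    grad = (noisy_cost(theta + c_k * delta)
            - noisy_cost(theta - c_k * delta)) / (2 * c_k * delta)
    theta -= a_k * grad      # stochastic approximation update
```

In practice the threshold is integer-valued, so such a scheme would round or project theta onto the admissible set; the continuous version above keeps the sketch minimal.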
-
Can deep convolutional neural networks (CNNs) for image classification be interpreted as utility maximizers with information costs? By performing set-valued system identification for Bayesian decision systems, we demonstrate that deep CNNs behave equivalently (in terms of necessary and sufficient conditions) to rationally inattentive Bayesian utility maximizers, a generative model used extensively in economics for human decision-making. Our claim is based on approximately 500 numerical experiments on 5 widely used neural network architectures. The parameters of the resulting interpretable model are computed efficiently via convex feasibility algorithms. As a practical application, we also illustrate how the reconstructed interpretable model can predict the classification performance of deep CNNs with high accuracy. The theoretical foundation of our approach lies in Bayesian revealed preference studied in microeconomics. All our results are on GitHub and completely reproducible.
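The revealed-preference test behind this claim can be illustrated with the NIAS ("no improving action switches") condition, one of the standard Bayesian revealed preference inequalities: each chosen action must be optimal under the posterior it induces. The prior, action-selection probabilities, and candidate utility below are toy numbers; the paper instead solves a convex feasibility problem to recover such a utility from data.

```python
import numpy as np

# Toy NIAS check (all numbers illustrative): given a prior over states, a
# classifier's action-selection probabilities p(a | x), and a candidate
# utility u[x, a], verify that each action maximizes expected utility under
# the posterior that choosing it reveals.

prior = np.array([0.5, 0.5])            # P(x) over 2 states
p_a_given_x = np.array([[0.8, 0.2],     # rows: state x, cols: action a
                        [0.3, 0.7]])
u = np.array([[1.0, 0.0],               # u[x, a]: candidate utility
              [0.0, 1.0]])

def nias_holds(prior, p_a_given_x, u) -> bool:
    joint = prior[:, None] * p_a_given_x   # p(x, a)
    p_a = joint.sum(axis=0)                # marginal p(a)
    post = joint / p_a                     # column a is the posterior p(x | a)
    eu = post.T @ u                        # eu[a, b]: utility of b given a chosen
    # NIAS: no action b improves on the action a actually taken
    return all(eu[a, a] >= eu[a].max() - 1e-12 for a in range(eu.shape[0]))
```

Because the NIAS inequalities are linear in the unknown utility u, checking whether *some* feasible u exists is a convex feasibility problem, which is the computation the paper reports running efficiently.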